NSF PAR Search | NSF Public Access Repository

A Comprehensive Study of Privacy Risks in Curriculum Learning

https://doi.org/10.56553/popets-2025-0033

Qiongna Chen, Joann; He, Xinlei; Li, Zheng; Zhang, Yang; Li, Zhou (January 2025, Proceedings on Privacy Enhancing Technologies)

Training a machine learning model with data following a meaningful order, i.e., from easy to hard, has been proven effective in accelerating the training process and achieving better model performance. The key enabling technique is curriculum learning (CL), which has seen great success and has been deployed in areas like image and text classification. Yet, how CL affects the privacy of machine learning is unclear. Given that CL changes the way a model memorizes the training data, its influence on data privacy needs to be thoroughly evaluated. To fill this knowledge gap, we perform the first study and leverage membership inference attack (MIA) and attribute inference attack (AIA) as two vectors to quantify the privacy leakage caused by CL. Our evaluation on 9 real-world datasets with attack methods (NN-based, metric-based, label-only MIA, and NN-based AIA) revealed new insights about CL. First, MIA becomes slightly more effective when CL is applied, but the impact is much more prominent on a subset of training samples ranked as difficult. Second, a model trained under CL is less vulnerable to AIA than to MIA. Third, existing defense techniques like MemGuard and MixupMMD are not effective under CL. Finally, based on our insights into CL, we propose a new MIA, termed Diff-Cali, which exploits the difficulty scores for result calibration and is demonstrated to be effective against all CL methods and the normal training method. With this study, we hope to draw the community's attention to the unintended privacy risks of emerging machine-learning techniques and develop new attack benchmarks and defense solutions.
Full Text Available
On Xing Tian and the Perseverance of Anti-China Sentiment Online

https://doi.org/10.1609/icwsm.v16i1.19348

Shen, Xinyue; He, Xinlei; Backes, Michael; Blackburn, Jeremy; Zannettou, Savvas; Zhang, Yang (June 2022, Proceedings of the International AAAI Conference on Web and Social Media)

Sinophobia, anti-Chinese sentiment, has existed on the Web for a long time. The outbreak of COVID-19 and the extended quarantine have further amplified it. However, we lack a quantitative understanding of the cause of Sinophobia as well as how it evolves over time. In this paper, we conduct a large-scale longitudinal measurement of Sinophobia, between 2016 and 2021, on two mainstream and fringe Web communities. By analyzing 8B posts from Reddit and 206M posts from 4chan’s /pol/, we investigate the origins, evolution, and content of Sinophobia. We find that anti-Chinese content may be evoked by political events not directly related to China, e.g., the U.S. withdrawal from the Paris Agreement. During the COVID-19 pandemic, daily usage of Sinophobic slurs has significantly increased even with the hate-speech ban policy. We also show that the semantic meanings of the words “China” and “Chinese” are shifting towards Sinophobic slurs with the rise of COVID-19 and remain the same in the pandemic period. We further use topic modeling to show that the topics of Sinophobic discussion are diverse and broad. We find that both Web communities share some common Sinophobic topics like ethnicity, economics and commerce, weapons and military, foreign relations, etc. However, compared to 4chan’s /pol/, more daily life-related topics including food, games, and stocks are found on Reddit. Our findings also reveal that the topics related to COVID-19 and blaming the Chinese government are more prevalent in the pandemic period. To the best of our knowledge, this paper is the longest quantitative measurement of Sinophobia.
Full Text Available
Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

https://doi.org/10.1007/978-3-031-19821-2_21

He, Xinlei; Liu, Hongbin; Gong, Neil Zhenqiang; Zhang, Yang (January 2022, European Conference on Computer Vision)

Full Text Available
Stealing Links from Graph Neural Networks

He, Xinlei; Jia, Jinyuan; Backes, Michael; Gong, Neil Zhenqiang; Zhang, Yang (January 2021, USENIX Security Symposium)

Full Text Available